04:00
2026-05-26
arxiv.org
generative-ai
Diff-Instruct with Diffused Reward: Towards Principled One-step Generator RL
Researchers have developed Diff-Instruct with Diffused Reward (DIDR), a data-free trajectory-level alignment framework that propagates reward-tilted clean-image distributions across all noise levels t…